Executive Summary

Objective  The  authors  proposed  to  develop  mathematics  and  algorithms 
which  will  allow  a  single  operator  to  teleoperate  a  team  of  robots,  using 
information  obtained  by  cameras  on  the  robots. 

Approach 

This  STIR  project  had  several  components: 

•  Solve  the  Self  Localization  problem  relative  to  a  local  coordinate 
system,  using  mutual  distance  information. 

•  Determine the Field of View of each camera, using the pose
information already determined.

•  Determine  which  cameras  are  on  the  leading  edge. 

•  Construct  the  panoramic  view  and  provide  that  view  to  the  operator. 

•  From  the  joystick  command,  construct  a  force  field  and  distribute 
that  force  to  each  robot. 

•  For  each  robot,  make  an  appropriate  movement. 

•  Provide  feedback  to  the  user  in  the  event  of  unfulfillable  motion 
commands. 

Relevance  This proposal is in response to the ARO Broad Agency
Announcement, Sections 5.2 and 3.5.

Control  of  a  team  of  robots  by  a  single  operator  is  particularly  important 
in  the  urban  warfare  missions  envisioned  by  the  Future  Combat  Systems 
program.  Such  teams  are  expected  to  participate  in  a  variety  of  surveillance 
and  materiel  transport  missions. 

The need for work in this area is exemplified by the fact that the U.S.
military has one component directly addressing problems similar to those
addressed in this proposal, the “Multi-robot Operator Control Unit” in
development at the Space and Naval Warfare Systems Center, San Diego [1].
Control of multiple robots in Army applications is also being addressed by
the SWARMS MURI at the University of Pennsylvania, as well as projects
within the Army Tank Automotive Research and Development Center and
the Army Research Laboratory.

1.  Introduction

In this paper, we report on the initial development of the component
capabilities, especially the imaging capabilities, in Computer Vision,
Autonomy, and Teleoperation for the long-term goal of operating a team of
robots for purposes of surveillance or materiel transport. The team is a
collection of five to fifty monocular mobile robots that are jointly
controlled by a single user with a joystick. Each robot communicates with
nearby robots, can sense the terrain in its vicinity proprioceptively, and
can coordinate with other robots to avoid obstacles as well as friendly
assets.

In  this  effort,  we  focussed  on  the  image  sensing  opportunities  provided  by 
such  a  team  of  monocular  mobile  robots  and  the  computer  vision  capabilities 
required  to  exploit  those  opportunities.  No  effort  was  expended  on  SLAM, 
and  little  on  control. 


2.  Objective 

Most of the effort on this project involved one particular subproblem:
Automatic composition of a panoramic mosaic. The operator must be able
to naturally specify a direction to observe; the corresponding subset of
individual cameras of robots on the periphery facing the specified
direction must be identified; and the information in the selected images
must be fused for presentation to the human operator in a way which the
human can readily grasp and conveniently use.

To  fully  accomplish  control  of  a  robot  team,  we  would  have  to  incorporate 
advanced  sensing  with  path  planning  and  control.  That  implies  a  list  of 
objectives  too  ambitious  to  be  completed  by  a  small  team  of  researchers  in 
a  brief  interval  of  time;  this  study  therefore  limited  itself  to  the  imaging 
component. 


3.  Approach 

While recognizing the importance of fusing sensing and control, we
focussed on the sensor suite here rather than on any particular part of
control, guidance, path planning, or path following. Other groups are
working on these latter problems, and we hope to take advantage of their
results if we are successful in demonstrating the computer vision
capabilities that we have proposed above, especially panorama composition,
which we regard as a milestone accomplishment to justify subsequent effort.
To our knowledge, no one is working on any such real-time, joystick-directed
composition of panorama mosaics from an optimally collected subset of a
collection of monocular mobile robots which are, in general, dispersed in
an arbitrary, ad-hoc distribution of locations.

We developed an algorithm for determining how to position and direct
cameras in such a way that the entire external field of view is guaranteed
to be observed. The mathematical details are presented in the Results
section. This involved determining camera pose (Algorithm 1) and camera
focal length (Algorithm 2).


4.  Background 


Though the problem of making sense of information from distributed
cameras is relatively new, there has already been one international
conference on the topic (held in Vienna in September of 2007), and a second
is about to be held (Stanford in September, 2008). The problem requires
some sort of consistency analysis, for example, using consistent labeling
[2] or game theory [3]. As described in the appendix, our approach also
uses consistency, an extension of the generalized Hough transform [4].

To construct the panoramic view envisioned, one might begin by
evaluating the applicability of image stitching algorithms. Xing and Miao
[5] use SIFT features in a way similar to the way our SKS [6] uses local
neighborhoods. The reader is referred to [7] for an excellent review of the
stitching literature. In this project, however, we determined an
alternative approach to stitching, which will be discussed in the Results
section of this report.

The  concept  of  Visual  SLAM  is  also  relatively  new:  There  was  a  workshop 
on  Visual  SLAM  at  the  Intelligent  Robotic  Systems  Conference  in  2007  (San 
Diego,  October,  2007).  Although  we  mention  the  word,  this  project  is,  in 
itself,  not  a  project  in  SLAM. 

Because  we  have  a  human  in  the  loop,  we  do  not  have  to  solve  the  SLAM 
problem  globally.  However,  we  must  maintain  a  rough  description  of  the 
pose  (position  and  orientation)  of  each  robot,  at  least  relative  to  the  other 
robots,  and  we  must  update  those  pose  estimates  as  the  team  moves.  This 
is  especially  challenging  for  team  members  which  are  not  part  of  the  leading 
edge  (and  therefore  do  not  necessarily  provide  image  input  to  the  human 
operator).  This  will  require  some  aspects  of  SLAM.  Furthermore,  in  order 
to  determine  the  leading  edge,  relative  SLAM  is  required. 

The SLAM literature distinguishes between “metric maps,” which describe
the environment by distances between points [8, 9, 10], and those that use
“topological” information [11, 12, 13]. The former approach quickly becomes
overwhelmed by computational complexity, but the latter strategies are not
particularly accurate. Our approach will be to represent the pose of each
individual robot by a particle-filter-like representation [14, 10], which
allows a probabilistic description of the pose. Interactions between
robots will be described by a topological map, allowing the incorporation
of metric information [15, 16, 13].


5.  Results 

5.1.  Visibility.  In this section, we present derivations which show how to
view the entire environment around the robot team. We show that it is
possible to have complete coverage outside the team provided at least some
subset of robots forms a convex polygon. These theoretical results are
presented here. Experiments have confirmed the effectiveness of these
methods, using Blender simulation.

Assume we are to operate a team of holonomic robots. For now, assume
the operational terrain is approximately planar and horizontal. Each robot
is equipped with a single camera which can pan and/or zoom. We model
these cameras as pinhole cameras with adjustable focal length. We are
concerned with camera formations which maintain “complete external
visibility.” Define $\partial$ to be the convex hull of the set of cameras.

Definition  A ray is a vector in the ground plane, passing through a camera
center and through the finite focal plane of that camera, and an exterior ray
is a ray into the exterior of $\partial$.

Figure 1.  A set of 7 cameras defining a convex polygon.
Camera $i$ is denoted $c_i$, and has angular field of view $\theta_i$.


Definition  A set of cameras has complete external visibility if there exists an
exterior ray in every possible direction. It should be noted that our use of the
word “visibility” is different from that of Flocchini et al. [17], who were also
interested in a ring of sensors. In their terminology, “complete visibility”
means every sensor can see every other sensor, whereas our definition means
the collection of sensors can see everything (that is not occluded) outside
the ring.

We will examine a particular arrangement of cameras in this context. Only
cameras which are on the convex hull will be of interest. That set (also
denoted by $\partial$) defines a convex polygon such as the one shown in
Figure 1. In this example, $\partial = \{c_1, c_2, \ldots, c_7\}$ and
$|\partial| = 7$. At each camera $c_i$, the exterior angle is shown and
denoted $\theta_i$. We will also use the notation $c_i$ later in this paper
to denote the coordinates of camera $i$.

Theorem 1  The subset of cameras making up the convex hull of a set
of cameras provides complete exterior visibility if the field of view of each
camera is the external angle of the polygon at that point.

Proof.  Consider the two cameras, $c_5$ and $c_6$, shown in Figure 1. Since
$\theta_5$ is the angular field of view of camera 5, and $\theta_6$ is
similarly the FOV of camera 6, and the FOVs do not overlap, cameras 5 and 6
together observe all angles between 0 and $\theta_5 + \theta_6$. By
induction, and since the sum of the exterior angles of a polygon is $2\pi$,
every direction is observed. □

Using  a  similar  argument,  we  note  that  each  direction  is  observed  only 
once. 

Algorithm  1,  Generating  complete  external  coverage 

(1)  Compute the convex hull of the set of cameras. This problem is
$O(n \log n)$ where $n = |\partial|$.

(2)  For camera $i$, orient the camera so the left side of its FOV is colinear
with the vector to camera $i - 1$. If $i = 1$, align with camera $n$.

(3)  Set the focal length of camera $i$ so that the right side of its FOV is
colinear with the vector to camera $i + 1$. If $i = n$, align with camera 1.

Algorithm  1  will  provide  complete  coverage,  but  there  is  a  problem:  changing 
the  FOV  zooms  the  observed  image,  so  objects  in  the  scene  will  appear  to 
grow  or  shrink  in  comparison  to  similar  objects  at  a  similar  range  viewed 
by  other  cameras. 
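
Steps (2) and (3) of Algorithm 1 can be sketched in a few lines. This is a
minimal illustration under stated assumptions, not the implementation used
in the project: the hull vertices are assumed to be already available in
counterclockwise order, and the names `unit` and `coverage_plan` and the
returned tuple layout are hypothetical. For each camera, the sketch returns
the two directions that bound its FOV (the extension of the edge arriving
from the previous camera, and the edge toward the next camera) together
with the angle between them, which is the exterior angle of the polygon.

```python
import math

def unit(a, b):
    """Unit vector pointing from point a to point b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    norm = math.hypot(dx, dy)
    return (dx / norm, dy / norm)

def coverage_plan(hull):
    """For each hull camera, return (position, edge_in, edge_out, fov).

    hull: (x, y) vertices of a strictly convex polygon in counterclockwise
    order (an assumption; computing the hull is a separate step).
    """
    n = len(hull)
    plan = []
    for i in range(n):
        prev_cam, cam, next_cam = hull[i - 1], hull[i], hull[(i + 1) % n]
        edge_in = unit(prev_cam, cam)    # FOV boundary colinear with camera i-1
        edge_out = unit(cam, next_cam)   # FOV boundary colinear with camera i+1
        dot = edge_in[0] * edge_out[0] + edge_in[1] * edge_out[1]
        fov = math.acos(max(-1.0, min(1.0, dot)))  # exterior angle at camera i
        plan.append((cam, edge_in, edge_out, fov))
    return plan

# Example: four cameras on a square, each needing a 90-degree FOV.
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
plan = coverage_plan(square)
total_fov = sum(fov for _, _, _, fov in plan)
```

For any strictly convex polygon the returned angles sum to $2\pi$, which is
the property used in the proof of Theorem 1.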

5.2.  Correction for different camera focal lengths.  The fact that each
camera potentially has a different focal length leads to the following
situation: Camera $i$ views half of an object, and camera $i + 1$ views the
other half. In a composite image, the two halves would potentially have
different sizes. This is guaranteed to happen unless all the cameras have
the same focal length, which can only happen if the contour is a regular
polygon. So, to make it appear that the same object has the same size, even
though the focal length has changed, we must resample the focal plane so
that the angle subtended by a single pixel is a constant. We accomplish
that resampling by considering the zoom-in and zoom-out as separate cases.

First let there be a “standard” focal length for each camera. This might,
for example, be the focal length that produces a FOV of $\pi/4$. Then all
cameras will have focal lengths that are relative to this.


5.2.1.  Zoom-out.  Figure 2 illustrates the zoom-out case: shortening the
focal length. The width of the focal plane in standard position is known.
For example, it might be 17.5 mm. Similarly, the focal length $f_0$ and FOV,
$\theta_0$, are known. After the zoom, only the FOV, $\theta$, is known.
From the small right triangle, we have

(1)  $x = x_0 - (f_0 - f)\tan\theta_0$

So the size ratio of a particular object on the focal plane is $x/x_0$, and

(2)  $\frac{x}{x_0} = 1 - \frac{(f_0 - f)\tan\theta_0}{x_0}$

But in general, we don’t have $f$ available, and must use $\theta$, so the
pixel contraction is

(3)  $\alpha = \frac{x}{x_0} = 1 - \frac{\left(f_0 - \frac{x_0}{\tan\theta}\right)\tan\theta_0}{x_0}$

5.2.2.  Zoom In.  In this case, $x_0 = f\tan\theta$, and therefore
$f = x_0/\tan\theta$, but

(4)  $dx = f\tan\theta_0 - x_0$

(5)  $= \frac{x_0}{\tan\theta}\tan\theta_0 - x_0$

(6)  $= x_0\left(\frac{\tan\theta_0}{\tan\theta} - 1\right)$

and the pixel dilation is

(7)  $\alpha = \frac{x_0 + dx}{x_0} = \frac{\tan\theta_0}{\tan\theta}$

Observation: For $\theta \to 0$, using equation 7, the dilation goes to
infinity, and the entire scene is viewed in a single pixel. For
$\theta \to \pi/2$, using equation 3, the contraction goes to zero, and
every pixel views the entire scene.
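
As a consistency check on the two cases, the zoom-out contraction of
equation 3 can be reduced to the same closed form as the zoom-in dilation
of equation 7. The step below assumes the standard-position relation
$x_0 = f_0\tan\theta_0$, which follows from the geometry described for
Figure 2 but is not stated explicitly in the text.

```latex
\begin{align*}
\alpha &= 1 - \frac{\left(f_0 - \frac{x_0}{\tan\theta}\right)\tan\theta_0}{x_0}
        = 1 - \frac{f_0\tan\theta_0}{x_0} + \frac{\tan\theta_0}{\tan\theta} \\
       &= 1 - 1 + \frac{\tan\theta_0}{\tan\theta}
        = \frac{\tan\theta_0}{\tan\theta}.
\end{align*}
```

Under that assumption a single expression covers both cases: $\alpha < 1$
when $\theta > \theta_0$ (zoom-out) and $\alpha > 1$ when
$\theta < \theta_0$ (zoom-in).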


Figure 2.  Zoom-out: the focal length is shortened from $f_0$
to $f$, moving the focal plane closer to the focal point. Before
the zoom, the length of the focal plane was $2x_0$, subtending
an angle of $2\theta_0$ (only half of the focal plane is shown in
this right triangle). After the zoom, the same portion of the
visual field only occupies a length of $x$ on the focal plane. So
the camera can see more, but a given object occupies fewer
pixels on the focal plane.


Figure 3.  Zoom-in: the focal length is lengthened from $f_0$
to $f$, moving the focal plane farther from the focal point.
The length of the physical focal plane is $2x_0$, subtending an
angle of $2\theta_0$. (Typically, $2x_0$ might be 35 mm.) Before the
zoom, if we were to image a point at distance $f$, we would
see an additional width of $dx$. However, when we zoom, and
actually move the focal plane to $f$, we only image a range of
$x_0$, a reduction in the field of view.


Algorithm 2, resampling to make objects appear the same size

For $i$ varying from 1 to the number of cameras:

(1)  Determine the direction vector to the next camera,
$v = (c_{i+1} - c_i)/\|c_{i+1} - c_i\|$.

(2)  Determine the direction vector from the previous camera,
$v' = (c_i - c_{i-1})/\|c_i - c_{i-1}\|$.

(3)  Compute $\theta_i$ from the inner product, $\theta_i = \cos^{-1}\langle v, v' \rangle$.

(4)  Resample the focal plane depending on whether $\theta_i$ is a zoom in or a
zoom out:

•  If $\theta_i > \theta_0$ (zoom out); in this case, $\alpha < 1$.

(a)  Let the original image have $R$ rows and $C$ columns, and
be denoted $f(r, c)$, $r = 1, \ldots, R$, $c = 1, \ldots, C$.

(b)  Create a new image, $f'(r, c)$, $r = 1, \ldots, R/\alpha$,
$c = 1, \ldots, C/\alpha$.

(c)  Scan over the new image, computing pixel values using,
for $r = 1$ to $R/\alpha$, $c = 1$ to $C/\alpha$,

(8)  $f'(r, c) = T(f, \alpha r, \alpha c)$,

where $T(f, y, x)$ is an interpolation function which finds
the best estimate of the discretely sampled image $f$ at the
point $y, x$.

•  If $\theta_i < \theta_0$ (zoom in); in this case, $\alpha > 1$.

(a)  Let the original image have $R$ rows and $C$ columns, and
be denoted $f(r, c)$, $r = 1, \ldots, R$, $c = 1, \ldots, C$.

(b)  Create a new image, $f'(r, c)$, $r = 1, \ldots, R/\alpha$,
$c = 1, \ldots, C/\alpha$.

(c)  Scan over the new image, computing pixel values using,
for $r = 1$ to $R/\alpha$, $c = 1$ to $C/\alpha$,

(9)  $f'(r, c) = T(f, \alpha r, \alpha c)$,

where $T(f, y, x)$ is an interpolation function which finds
the best estimate of the discretely sampled image $f$ at the
point $y, x$.

When this process is complete, the new image $f'$ will contain
objects whose scale is consistent with the corresponding objects¹
in adjacent images.


¹ Since there is no overlap in viewpoints, the same 3D points will not be imaged by
adjacent cameras. However, it is quite possible to have the left side of a house in one
image and the right side of the same house in a neighboring image, and the scales must
be consistent.
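
The resampling step of Algorithm 2 can be sketched as follows. This is an
illustration under stated assumptions rather than the project's code: the
scale factor is taken to be $\alpha = \tan\theta_0/\tan\theta$, the
corrected image is assumed to have $R/\alpha$ rows and $C/\alpha$ columns
and to be sampled at $(\alpha r, \alpha c)$, indices are 0-based, and
bilinear interpolation is used as one possible choice for $T$. The
function names `scale_factor`, `interpolate`, and `resample` are
hypothetical.

```python
import math

def scale_factor(theta, theta0):
    """Apparent-size ratio alpha = tan(theta0) / tan(theta)."""
    return math.tan(theta0) / math.tan(theta)

def interpolate(image, y, x):
    """Bilinear estimate of image at fractional (row y, column x).

    Coordinates are clamped to the image border.
    """
    rows, cols = len(image), len(image[0])
    y = min(max(y, 0.0), rows - 1.0)
    x = min(max(x, 0.0), cols - 1.0)
    r0, c0 = int(y), int(x)
    r1, c1 = min(r0 + 1, rows - 1), min(c0 + 1, cols - 1)
    wy, wx = y - r0, x - c0
    top = (1 - wx) * image[r0][c0] + wx * image[r0][c1]
    bottom = (1 - wx) * image[r1][c0] + wx * image[r1][c1]
    return (1 - wy) * top + wy * bottom

def resample(image, alpha):
    """Rescale image by 1/alpha so objects return to the standard scale."""
    rows, cols = len(image), len(image[0])
    new_rows = max(1, round(rows / alpha))
    new_cols = max(1, round(cols / alpha))
    return [[interpolate(image, alpha * r, alpha * c) for c in range(new_cols)]
            for r in range(new_rows)]
```

With this convention a zoomed-out image ($\alpha < 1$) is enlarged and a
zoomed-in image ($\alpha > 1$) is reduced, so that a pixel subtends roughly
the same angle in every corrected image.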
5.3.  Computing the Convex Hull.  The convex hull may be computed in
two different ways: one way is $O(n^2)$ in the number of cameras, and one is
$O(n \log n)$, which is faster but requires much more complex code. Since
for this application we had only around ten cameras, we implemented the
simpler algorithm.
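
The report does not say which $O(n^2)$ algorithm was implemented. As an
illustration only, gift wrapping (the Jarvis march) is one well-known
method whose worst-case cost is quadratic in the number of points; the
sketch below is an assumption about what such a "simpler algorithm" might
look like, and the names `cross`, `dist2`, and `gift_wrap` are
hypothetical. Points are (x, y) tuples.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive if o->a->b turns left."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def dist2(a, b):
    """Squared Euclidean distance between points a and b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def gift_wrap(points):
    """Convex hull by gift wrapping, returned in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    hull = []
    current = pts[0]  # lexicographically smallest point is always on the hull
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            # Take p if it lies to the right of current->candidate; on a
            # colinear tie keep the farther point so edge interiors drop out.
            if turn < 0 or (turn == 0 and
                            dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        # Stop when the wrap closes (the length guard protects against
        # round-off preventing closure with floating-point input).
        if current == pts[0] or len(hull) > len(pts):
            break
    return hull
```

Because colinear points are dropped, the returned polygon is strictly
convex, which is the condition required in the discussion of singular
formations.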

5.4.  Singular Formations.  From Figure 1 we first observe that the interior
of the convex hull is not visible. This is an artifact of the architecture of
the formation control and the price we pay for the ability to look in every
external direction without duplicate observation vectors. This requires that
the hull be strictly convex, for a singular condition occurs if cameras
$i - 1$, $i$, and $i + 1$ are colinear, for any $i$. Thus, our algorithm for
formation control will need to ensure strict convexity, as will be discussed
in the next section.

5.5.  Formation Control.  The robots may be modeled as having mass and
friction and obeying Newtonian mechanics [18]. The robots move in response
to applied forces. Some of these forces are virtual (computed) in order to
maintain the formation shape, and some are external to ensure avoidance of
obstacles.

5.5.1.  Formation Control Forces.  Let $\mathbf{x}_i$, $i = 1, \ldots, n$,
denote the spatial coordinate vector of camera $i$. This is a 3-vector, but
may be considered a 2-vector if all motion is restricted to the ground plane.

By $\mathbf{x}_0$, we denote the coordinates of the special point of
attraction for the swarm. In the absence of other forces, $\mathbf{x}_0$ is
the center of a regular polygon of cameras. Each camera $i$ is attracted to
the attractor with an inward force equal to


(10)  $\mathbf{f}_{0,i} = \beta_0 (\mathbf{x}_0 - \mathbf{x}_i)$.

We observe that, unlike gravity, this force grows stronger with distance
instead of weaker.

Each camera experiences a repulsive force from the other cameras,
producing a net repulsive force of

(11)  [equation illegible in the source; only a summation with upper limit $n$ over the other cameras is recoverable]

Finally, one other force comes into play to ensure that the formation
remains convex. To see this, consider the three cameras illustrated in
Figure 4. We wish to apply a force to node $i$, having coordinates
$\mathbf{x}_i$, to prevent it from moving into a position where it is
colinear with $\mathbf{x}_{i-1}$ and $\mathbf{x}_{i+1}$. We accomplish this
by first finding the point $\mathbf{p}$ which is on the line between
$\mathbf{x}_{i-1}$ and $\mathbf{x}_{i+1}$ and is as close as possible to
$\mathbf{x}_i$. The vector from $\mathbf{p}$ to $\mathbf{x}_i$ determines
the direction of the force acting on camera $i$.

Since the two vectors are orthogonal, we have first

(12)  $(\mathbf{x}_{i+1} - \mathbf{x}_{i-1}) \cdot (\mathbf{p} - \mathbf{x}_i) = 0$

and since the point $\mathbf{p}$ lies on the line between the two points, it
satisfies

(13)  $\mathbf{p} = \delta\,\mathbf{x}_{i-1} + (1 - \delta)\,\mathbf{x}_{i+1}$

Figure 4.  Three points on the contour of a formation. To
maintain convexity, the center point must not align with the
line between its two neighbors.


Figure 5.  The magnitude of the control applied to a typical
robot in a ring as a function of the radius of the ring. We
observe that the equilibrium state for this ring occurs at
$r = 3.3$.


In Appendix A, it is shown that $\delta$ is determined easily, making it
possible to determine $\mathbf{p}$ using Equation 13. Then the formation
force is defined to be

(14)  $\mathbf{f}_{p,i} = \beta_p \frac{\mathbf{x}_i - \mathbf{p}_i}{\|\mathbf{p}_i - \mathbf{x}_i\|^3}$

One may think about the three formation forces as artificial forces, or as
applied controls. In this paper, we do not attempt to solve the n-body
problem in closed form, and instead state that with no other applied forces,
we have demonstrated consistently stable behavior for any initial convex
state. Figure 5 shows the magnitude of the net control vector on one of the
robots in an 8-robot ring, as a function of the radius of the ring.
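
The convexity-preserving force can be illustrated with a short sketch. It
finds the closest point $\mathbf{p}$ on the line through the two
neighbours by orthogonal projection (Equations 12 and 13) and then applies
a force directed from $\mathbf{p}$ toward $\mathbf{x}_i$. Several details
are assumptions rather than facts from the report: the gain `beta_p`, the
inverse-distance `exponent` of 3 (the exponent in Equation 14 is not
clearly legible in the source), and the function names
`closest_point_on_line` and `convexity_force`.

```python
import math

def closest_point_on_line(x_prev, x_i, x_next):
    """Return (delta, p), where p = delta*x_prev + (1 - delta)*x_next is the
    point on the line through the two neighbours that is closest to x_i."""
    dx, dy = x_next[0] - x_prev[0], x_next[1] - x_prev[1]
    length2 = dx * dx + dy * dy
    delta = (dx * (x_next[0] - x_i[0]) + dy * (x_next[1] - x_i[1])) / length2
    p = (delta * x_prev[0] + (1 - delta) * x_next[0],
         delta * x_prev[1] + (1 - delta) * x_next[1])
    return delta, p

def convexity_force(x_prev, x_i, x_next, beta_p=1.0, exponent=3):
    """Force on camera i directed from p toward x_i.

    beta_p and exponent are illustrative assumptions, not values from the
    report.
    """
    _, p = closest_point_on_line(x_prev, x_i, x_next)
    rx, ry = x_i[0] - p[0], x_i[1] - p[1]
    dist = math.hypot(rx, ry)
    if dist == 0.0:
        raise ValueError("camera is colinear with its neighbours")
    scale = beta_p / dist ** exponent
    return (scale * rx, scale * ry)

delta, p = closest_point_on_line((0.0, 0.0), (1.0, 1.0), (2.0, 0.0))
force = convexity_force((0.0, 0.0), (1.0, 0.5), (2.0, 0.0))
```

With an exponent greater than 1, the force grows without bound as camera
$i$ approaches the line between its neighbours, which is what pushes the
formation away from the singular colinear configuration.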

5.6.  External  Forces.  Following  the  lead  of  many  investigators,  we  model 
occlusions  by  repulsive  forces.  These  forces  serve  to  deform  the  shape  of  the 
ring,  to  allow  passage  past  individual  occlusions,  narrow  passageways,  etc. 
The restriction is that the ring cannot deform sufficiently to violate strict
convexity.

We assume that the spatial coordinates of occlusions and obstacles may
be determined by stereopsis [19], or another sensor, and do not address the
sensing problem here.

We  do  make  one  concession  to  computational  complexity:  continuous 
obstacles  are  subsampled  on  the  focal  plane  at  a  density  equal  to  every  k 
pixels,  with  a  minimum  of  two  points,  and  only  those  points  are  used  to 
generate  controls  for  the  ring.  Thus,  if  an  object  subtends  only  a  few  pixels, 
its  control  vector  is  computed  very  quickly. 
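
The subsampling rule can be sketched as below. This is one plausible
reading of "every k pixels, with a minimum of two points", offered as an
assumption: the obstacle is represented as an ordered list of pixel
coordinates, every k-th pixel is kept, and the last pixel is added if
fewer than two points would otherwise remain. The function name
`subsample_obstacle` is hypothetical.

```python
def subsample_obstacle(pixels, k):
    """Keep every k-th pixel of an obstacle, retaining at least two points
    whenever the obstacle itself has two or more pixels."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if len(pixels) <= 2:
        return list(pixels)
    kept = list(pixels[::k])
    if len(kept) < 2:
        kept.append(pixels[-1])
    return kept
```

The number of points that generate controls then grows with the obstacle's
extent on the focal plane divided by k, so small obstacles cost very
little.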

We did implement and test one version of a motion model with dynamics,
but it proved difficult to achieve stability. Analysis showed that the
instability came from two sources: 1) This is an n-body problem. Each robot
exerts a force on every other robot. There are unexpected minima, e.g. a
symmetric ring of robots with a single robot in the center of the ring. 2)
The simulation is heavily computational. In order to achieve simulations in
a reasonable time, the step size needs to be large, and the large step size in
turn produces unstable behavior.

Since the only purpose of the control simulation is to verify the vision
algorithms, we simplified the dynamics to omit inertia and friction, and
simply use the model

(15)  $\Delta\mathbf{x}_i = a\,\mathbf{f}_i$

At each iteration, each robot simply moved an incremental distance
proportional to the applied force. This allowed effective simulation.
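
The simplified model of Equation 15 amounts to a single line of update
code. The sketch below is illustrative only: it applies the update using
just the attractive force of Equation 10, and the gain values `a = 0.1`
and `beta0 = 1.0` and the function names are assumptions chosen for the
example, not values from the report.

```python
def attraction_force(x0, x, beta0=1.0):
    """Attractive force toward the point x0, proportional to distance."""
    return (beta0 * (x0[0] - x[0]), beta0 * (x0[1] - x[1]))

def kinematic_step(positions, forces, a=0.1):
    """Move each robot by a displacement proportional to its applied force."""
    return [(x + a * fx, y + a * fy)
            for (x, y), (fx, fy) in zip(positions, forces)]

positions = [(4.0, 0.0), (0.0, -2.0)]
attractor = (0.0, 0.0)
for _ in range(50):
    forces = [attraction_force(attractor, x) for x in positions]
    positions = kinematic_step(positions, forces)
```

With attraction alone, each step shrinks a robot's distance to the
attractor by the factor $1 - a\beta_0$, so the update is stable only when
that factor has magnitude below one. This is a simple instance of the
step-size sensitivity noted for the full dynamic model.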


5.7.  Demo.  To demonstrate the algorithm, a powerful computer system
was set up, utilizing a 6-core i7 with three NVIDIA graphics processors,
controlling a total of five monitors. The operator sits facing the center
monitor, with monitors at ±45° and ±90°. The user then has the sensation
of a full frontal and side view. The display algorithm is demonstrated to be
effective and usable.

Videos will be sent under separate cover to the program manager.

Appendix A.  Derivation of $\delta$

Defining $\mathbf{x}_i = [x_i \; y_i]^T$, and substituting Equation 13 into
Equation 12 (with cameras $i - 1$, $i$, $i + 1$ numbered 1, 2, 3),

(16)  $\begin{bmatrix} x_3 - x_1 & y_3 - y_1 \end{bmatrix} \begin{bmatrix} \delta x_1 + (1 - \delta) x_3 - x_2 \\ \delta y_1 + (1 - \delta) y_3 - y_2 \end{bmatrix} = 0,$

leading to

(17)  $\delta = -\frac{y_1 y_3 + x_1 x_3 + y_2 y_3 + x_2 x_3 - y_1 y_2 - x_1 x_2 - x_3^2 - y_3^2}{(x_1 - x_3)^2 + (y_1 - y_3)^2}$
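
The same result can be written compactly in vector form, which may be
easier to check than the expanded scalar expression. The derivation below
is a sketch that uses only Equations 12 and 13, with cameras $i - 1$, $i$,
$i + 1$ numbered 1, 2, 3; the shorthand $\mathbf{d}$ is introduced here
for convenience and does not appear in the report.

```latex
\begin{align*}
\mathbf{d} &= \mathbf{x}_3 - \mathbf{x}_1, \qquad
\mathbf{p} - \mathbf{x}_2 = (\mathbf{x}_3 - \mathbf{x}_2) - \delta\,\mathbf{d}, \\
0 &= \mathbf{d}\cdot(\mathbf{p} - \mathbf{x}_2)
   = \mathbf{d}\cdot(\mathbf{x}_3 - \mathbf{x}_2) - \delta\,\|\mathbf{d}\|^2
\;\;\Longrightarrow\;\;
\delta = \frac{(\mathbf{x}_3 - \mathbf{x}_1)\cdot(\mathbf{x}_3 - \mathbf{x}_2)}
              {\|\mathbf{x}_3 - \mathbf{x}_1\|^2}.
\end{align*}
```

Expanding the dot product in the numerator reproduces the scalar terms of
the closed-form expression for $\delta$ term by term.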